17 research outputs found

    Qualitative Action Recognition by Wireless Radio Signals in Human–Machine Systems

    Human-machine systems require a deep understanding of human behaviors. Most existing research on action recognition has focused on discriminating between different actions; the quality of executing an action has received little attention thus far. In this paper, we study the quality assessment of driving behaviors and present WiQ, a system to assess the quality of actions based on radio signals. The system includes three key components: a deep neural network based learning engine to extract the quality information from changes in signal strength, a gradient-based method to detect the signal boundary of an individual action, and an activity-based fusion policy to improve recognition performance in a noisy environment. Using the quality information, WiQ can distinguish three body statuses with an accuracy of 97%, whereas for identification among 15 drivers the average accuracy is 88%. Our results show that, via dedicated analysis of radio signals, a fine-grained action characterization can be achieved, which can facilitate a large variety of applications, such as smart driving assistants.

    Adaptive Sub-Nyquist Spectrum Sensing for Ultra-Wideband Communication Systems

    With the ever-increasing demand for high-speed wireless data transmission, ultra-wideband spectrum sensing is critical to support cognitive communication over an ultra-wide frequency band in ultra-wideband communication systems. However, it is challenging to design an analog-to-digital converter that meets the Nyquist rate for an ultra-wideband frequency band. Therefore, we explore a spectrum sensing mechanism based on sub-Nyquist sampling and conduct extensive experiments to investigate the influence of the sampling rate, the bandwidth resolution and the signal-to-noise ratio on the accuracy of sub-Nyquist spectrum sensing. We then propose an adaptive policy that determines the optimal sampling rate and bandwidth resolution when the spectrum occupancy or the strength of the existing signals changes. The performance of the policy is verified by simulations.

    Towards Location Independent Gesture Recognition with Commodity WiFi Devices

    Recently, WiFi-based gesture recognition has attracted increasing attention. Because WiFi signals are sensitive to the environment, an activity recognition model trained at a specific place can hardly work well at other places. To tackle this challenge, we propose WiHand, a location-independent gesture recognition system based on commodity WiFi devices. Leveraging low-rank and sparse decomposition, WiHand separates the gesture signal from background information, making it resilient to location variation. Extensive evaluations show that WiHand can achieve an average accuracy of 93% across various locations. In addition, WiHand works well in through-the-wall scenarios.

    End-to-End Mandarin Speech Recognition Combining CNN and BLSTM

    Since conventional Automatic Speech Recognition (ASR) systems often contain many modules and draw on varied expertise, such models are hard to build and train. Recent research shows that end-to-end ASR systems can significantly simplify the speech recognition pipeline and achieve performance competitive with conventional systems. However, most end-to-end ASR systems are neither reproducible nor comparable because they use specific language models and in-house training databases that are not freely available. This is especially common for Mandarin speech recognition. In this paper, we propose a CNN+BLSTM+CTC end-to-end Mandarin ASR system. It uses a Convolutional Neural Network (CNN) to learn local speech features, a Bidirectional Long Short-Term Memory (BLSTM) network to learn past and future contextual information, and Connectionist Temporal Classification (CTC) for decoding. Our model is trained entirely on AISHELL-1, by far the largest open-source Mandarin speech corpus, using neither in-house databases nor external language models. Experiments show that our CNN+BLSTM+CTC model achieves a WER of 19.2%, outperforming the best existing work. Because all the data corpora we used are freely available, our model is reproducible and comparable, providing a new baseline for further Mandarin ASR research.
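As a rough illustration of the CTC decoding step mentioned in the abstract above, the sketch below applies the standard CTC collapsing rule to a per-frame label path: merge consecutive repeats, then drop the blank symbol. It is a minimal sketch, not the paper's decoder. The function name `ctc_greedy_collapse`, the choice of `0` as the blank index, and the assumption that a best label per frame has already been picked (greedy decoding) are all illustrative; the abstract does not say whether greedy or beam-search decoding was used.

```python
from itertools import groupby


def ctc_greedy_collapse(frame_labels, blank=0):
    """Collapse a per-frame label path into an output sequence.

    frame_labels: iterable of integer label ids, one per frame
                  (assumed to be the argmax of the network output).
    blank:        id of the CTC blank symbol (assumed to be 0 here).
    """
    # groupby merges runs of identical consecutive labels into one item;
    # blanks are removed afterwards, so a blank between two identical
    # labels keeps them as two separate output symbols.
    return [label for label, _ in groupby(frame_labels) if label != blank]


if __name__ == "__main__":
    # Hypothetical frame-level path over a toy label set.
    path = [0, 1, 1, 0, 1, 2, 2, 0]
    print(ctc_greedy_collapse(path))
```

Merging repeats before removing blanks matters: the blank between the two runs of label `1` in the toy path is what lets the output contain that label twice.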

    An Overview of End-to-End Automatic Speech Recognition

    Automatic speech recognition, especially large vocabulary continuous speech recognition, is an important issue in the field of machine learning. For a long time, the hidden Markov model (HMM)-Gaussian mixture model (GMM) has been the mainstream speech recognition framework. Recently, however, the HMM-deep neural network (DNN) model and the end-to-end model, both using deep learning, have achieved performance beyond HMM-GMM, and the two perform comparably. The HMM-DNN model is nevertheless limited by unfavorable factors inherited from HMM, such as forced segmentation and alignment of the data, independence assumptions, and separate training of multiple modules, whereas the end-to-end model offers a simplified model, joint training, direct output, and no need for forced data alignment, among other advantages. The end-to-end model is therefore an important research direction in speech recognition. In this paper we review the development of end-to-end models. We first introduce the basic ideas, advantages and disadvantages of HMM-based and end-to-end models, and point out that the end-to-end model is the development direction of speech recognition. We then focus on the principles, progress and research hotspots of three different end-to-end models, namely connectionist temporal classification (CTC)-based, recurrent neural network (RNN)-transducer and attention-based models, and compare them in detail both theoretically and experimentally. Finally, we point out their respective advantages and disadvantages and the possible future development of the end-to-end model. Automatic speech recognition is a pattern recognition task in the field of computer science, which is a subject area of Symmetry.

    Facial Expression Recognition: A Survey

    Facial Expression Recognition (FER), as the primary processing method for non-verbal intentions, is an important and promising field of computer vision and artificial intelligence, and one of the subject areas of symmetry. This survey is a comprehensive and structured overview of recent advances in FER. We first categorise the existing FER methods into two main groups, i.e., conventional approaches and deep learning-based approaches. To highlight their differences and similarities, we propose a general framework for a conventional FER approach and review the technologies that can be employed in each component. For deep learning-based methods, four kinds of state-of-the-art neural network-based FER approaches are presented and analysed. In addition, we introduce seventeen commonly used FER datasets and summarise four FER-related elements of datasets that may influence the choice and processing of FER approaches. Evaluation methods and metrics are given in the later part to show how to assess FER algorithms, along with performance comparisons of different FER approaches on the benchmark datasets. At the end of the survey, we present some challenges and opportunities that need to be addressed in the future.

    Efficiently Passive Monitoring Flow Bandwidth

    Using the flow-conservation law, the number of activated monitor agents needed to monitor link bandwidth usage can be reduced. In this paper, we address the problem of efficient passive monitoring of flow bandwidth based on flow conservation, which can be reduced to the weak vertex cover problem, an NP-hard problem. We give an approximation algorithm with approximation ratio 2 to solve it. The effectiveness of our monitoring algorithm is validated by simulation over a wide range of network topologies.